Abstract: Development of neural and sensory primordia at the early stages of
embryogenesis depends on the activity of two B1 Sox transcription factors, Sox2 and Sox3.
The embryonic expression patterns of the Sox2 and Sox3 genes are similar, yet they show
gene-unique features. We screened for enhancers of the 231-kb genomic region
encompassing Sox3 of chicken, and identified 13 new enhancers that showed activity in
different domains of the neuro-sensory primordia. Combined with the three Sox3-proximal
enhancers determined previously, at least 16 enhancers were involved in Sox3 regulation.
Starting from the NP1 enhancer, more enhancers with different specificities are activated in
sequence, resulting in complex overlapping patterns of enhancer activities. NP1 was
activated in the caudal lateral epiblast adjacent to the posterior growing end of neural plate,
and by the combined action of Wnt and Fgf signaling, similar to the Sox2 N1 enhancer
involved in neural/mesodermal dichotomous cell lineage segregation. The Sox3 D5
enhancer and Sox2 N3 enhancer were also activated similarly in the diencephalon, optic
vesicle and lens placode, suggesting analogies in their regulation. In general, however, the
specificities of the enhancers were not identical between Sox3 and Sox2, including the
cases of the NP1 and D5 enhancers.

1. Introduction

Group B1 Sox genes, Sox1, Sox2 and Sox3, encode HMG-domain transcription factors that show
almost identical activity in the activation of target genes in transactivation assays [1-3], which is
consistent with the model of their derivation from the same ancestral gene as a consequence of genome
duplications [4]. In embryos, these B1 Sox factors regulate neuro-sensory development [5] in a
functionally redundant manner when their expression overlaps in tissue. In most vertebrate species, the
expression of Sox2 and Sox3, which are activated in the early stages of embryogenesis, occurs
prominently in the epiblast and neuro-sensory primordia [5-13]. In contrast, Sox1 is activated at much
later stages than Sox2 and Sox3 [5-7,14,15].

The most compelling evidence for the redundant functions and resulting functional compensation
among B1 Sox genes has been provided by a study using zebrafish, where the developmental
defects became evident only when activities of all B1 Sox genes expressed at early stages were
downregulated [16]. In mouse embryos, the functional compensation between Sox2 and Sox3 has been
indicated by the occurrence of embryonic lethality from Sox2+/-; Sox3-/- compound mutations, despite
the viability of the Sox2+/- and Sox3-/- animals [17]. In addition, Sox3 activity compensates for the
loss of Sox2 expression in the epiblast of post-gastrulation embryos [18].

The consequence of sharing of functions among the B1 Sox genes is that the targeted disruption of
Sox1, Sox2 or Sox3 resulted in defective phenotypes only in the specific organs uniquely expressing
these transcription factors. Sox1-mutant mice show abnormal lenses [19] and ventral striatum [20].
Sox2-mutant embryos die around the implantation stage, because Sox2 is the only B1 Sox expressed at
that stage [21]. Sox3-deficient embryos specifically develop abnormalities in the pituitary gland [22],
gonad [23], and pharyngeal arch morphogenesis [17].

Functional dominance between Sox2 and Sox3 depends on the animal species. In amniotes, e.g.,
human, mouse and chicken, Sox2 expression covers the entire neural primordia, and it is considered
the lead transcription factor gene in primordial neural cells [24,25]. However, in lower vertebrates,
Sox3 expression is more prevalent in the neural primordia than Sox2 expression [8,9,12,26], suggesting
an evolutionary shift in the major regulatory processes in early neurogenesis from Sox3-centered
regulation to Sox2-centered regulation in order to fulfill the equivalent B1 Sox functions [4]. Although the
expression domains of Sox2 and Sox3 overlap extensively, their expression patterns are not identical,
indicating differences in the regulation of these two B1 Sox genes. However, the regulation of Sox2
and Sox3 must be coordinated in the developmental processes that depend on the overall B1 Sox
activity level.

Therefore, the regulation of the Sox2 and Sox3 genes is essential for the proper development of
neural and sensory tissues. We previously identified more than 20 enhancers that are distributed in a
200-kb genomic region encompassing the Sox2 gene [27,28]. Each of these Sox2 enhancers was
regulated in a unique manner, reflecting particular mechanisms that operate in specific tissues and/or at
specific stages of neuro-sensory development [3,18,29,30]. Many of these enhancers are conserved in
DNA sequences across vertebrate species [4,18,27,28], indicating strong conservation of the
mechanisms of tissue regulation involving Sox2 function. The N2 enhancer, which is responsible
for Sox2 activation in the epiblast and early anterior neural plate, is regulated by Zic, Pou and Otx
factors [18,31]. In contrast, the N1 enhancer of Sox2 is activated in the caudal lateral epiblast
adjacent to the primitive streak. Axial stem cells, which are bipotential precursors for the neural plate
and the paraxial mesoderm, reside in the caudal lateral epiblast [32], and produce neural
and mesodermal cells, depending on whether Sox2 or Tbx6 is activated [33]. In addition, it has been
shown that both axial stem cell maintenance and N1 enhancer activation depend on Wnt and Fgf
signals [29,32-34].

In this study, we report a systematic survey and characterization of the enhancers that regulate
the Sox3 gene in neuro-sensory development. Many enhancers were found in the 231-kb span of the
Sox3-encompassing chicken genomic region, as was the case for Sox2 enhancers. In some cases, the
enhancers were similarly regulated between the Sox3 and Sox2 genes. However, the varieties of
enhancer specificities in spatial and temporal terms were substantially different between these genes.
The Sox3 NP1 enhancer that was activated in the caudal lateral epiblast was analyzed in detail because
of its similarity to the Sox2 N1 enhancer. The Sox3 NP1 enhancer was found to be regulated by Wnt
and Fgf signals, similar to the Sox2 N1 enhancer, which is consistent with the essential involvement of
Wnt and Fgf signaling in axial stem cell regulation [32].

A previous study using transgenic mouse embryos identified three putative enhancers within the
genomic span 3 kb upstream and downstream of the Sox3 gene, which direct gene expression in
distinct domains of the neural primordia [35]. In addition, an analogous Sox3-proximal DNA sequence
of Xenopus laevis can reproduce a part of the Sox3 expression patterns in transgenic frogs [12]. Our
present study screened a wider genomic region and found more varieties of enhancers that regulate the
Sox3 gene in neuro-sensory development.

2. Results 

2.1. Distribution of the Conserved Sequence Blocks (CSBs) in the Region Surrounding the Sox3 Gene 
Locus in Various Vertebrate Species 

The Sox3 gene is located on the X-chromosome in mammals, but it is autosomal in other vertebrate
species. It is located on chromosome 4 in chicken, on scaffold GL172698.1 (without chromosomal
assignment) in Xenopus tropicalis, on chromosome 14 in zebrafish, and on chromosome 10 in medaka
(Figure S1). As the majority of enhancer sequences are included in the sequence blocks that are
strongly conserved across the species (>60% DNA sequence identity over a length of 100 bp) [27], we
compared the distribution of CSBs. As shown in Figure 1 and Table S1, many sequence blocks
conserved between chicken and mammals and/or between chicken and Xenopus were identified both
upstream and downstream of the Sox3 gene. In fish genomes, a fraction of the blocks were identified,
but the degrees of sequence conservation were generally low (data not shown). CSBs present in the
chicken genome were numbered from 1 to 47. These blocks were located in the region that was
between 134 kb upstream and 97 kb downstream relative to the Sox3 open reading frame (ORF) start
site (231 kb total), whereas in mammalian genomes the same conserved sequences were distributed in
wider regions (940 kb in the mouse and 1.4 Mb in the human). Upstream CSBs from block 1 to 11
tended to be shorter (average 163 bp) than the CSBs overall (average 309 bp, excluding the ~1 kb
block 23 that included the Sox3 ORF). Downstream of block 47, no CSBs were found between
chicken and other animal species. From these observations, the 231-kb span of the chicken genome
encompassing the Sox3 gene was subjected to an enhancer survey.
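
The CSB criterion used above (>60% identity over a length of 100 bp) can be illustrated with a
sliding-window scan over a pairwise alignment. The following is only a minimal sketch: it assumes
the two sequences have already been aligned (gaps written as "-"), and the function name
find_conserved_blocks and its parameters are hypothetical, not the software used in this study.

```python
def find_conserved_blocks(ref_aln, other_aln, window=100, min_identity=0.60):
    """Return merged (start, end) alignment-column ranges (0-based, end exclusive)
    covered by at least one window whose identity exceeds min_identity.

    Assumption: ref_aln and other_aln are two rows of the same pairwise
    alignment, so they have equal length and use '-' for gaps.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    n = len(ref_aln)
    if n < window:
        return []

    # 1 where both rows carry the same base (gap columns never count as a match)
    matches = [
        1 if a == b and a != "-" else 0
        for a, b in zip(ref_aln.upper(), other_aln.upper())
    ]

    blocks = []
    count = sum(matches[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # slide the window by one column
            count += matches[start + window - 1] - matches[start - 1]
        if count / window > min_identity:
            end = start + window
            if blocks and start <= blocks[-1][1]:
                # overlaps or touches the previous block: extend it
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks
```

Overlapping qualifying windows are merged, so each returned range corresponds to one candidate
block; the thresholds are exposed as parameters so that stricter or looser definitions can be tried.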

Figure 1. Comparison of the genome sequence organization of the Sox3-encompassing
regions of human, mouse, chicken and Xenopus. The arrangement of the conserved
sequence blocks (CSBs) that were numbered 1 to 47 in the region between 134 kb
upstream and 97 kb downstream of the chicken Sox3 gene is shown. A CSB is defined as a
sequence of more than 100 bp that shows >60% base identity with the chicken genome.
The corresponding CSBs in different genomes bear the same numbers
as in the chicken. The spacing between CSBs is similar for chicken and Xenopus
genomes, but is widened in mammalian genomes. Human CSB1 positioned at -970 kb is
not included in this Figure. Downstream, the correspondence between the mammalian and
chicken sequences is lost beyond CSB47. The genomic coordinates of the CSBs in each
genome are given in Table S1.

2.2. Screening for Enhancer Sequences and Their Characterization



We utilized the tkEGFP vector, which by itself does not express EGFP upon electroporation in
chicken embryos but expresses EGFP in a tissue-specific manner when an enhancer-bearing genomic
fragment is inserted [27,36]. We initially analyzed the Sox3-proximal 50 kb region using bacteriophage
clones derived from a chicken genome library. The overlapping DNA fragments C1 to C13 were
prepared (Figure 2), inserted into the tkEGFP vector, and assessed for enhancer activities. Two regions
of the genomic sequence exhibited enhancer activities: C4/5 in the st. 11 lens, and C12/13 in st. 11
diencephalon and spinal cord (Figures 3 and 4, Table S2). The C12 sequence and the C13 sequence
included in C12 exhibited identical enhancer activity. The C4/5 sequence included three CSBs (17, 18
and 19). The cloning of individual CSBs indicated that the CSB19 sequence was responsible for the
lens enhancer activity of C4/5. The CSB26 sequence included in the C12 sequence exhibited enhancer
activity in the diencephalon and spinal cord, which was identical to the activity of the C12 sequence.

Figure 2. DNA sequences derived from the Sox3-encompassing genome region of the
chicken and their analysis for enhancer activities. Distribution of CSBs 1-47 (black bars)
upstream and downstream of the Sox3 gene. CSB23 includes the Sox3 open reading frame
(ORF). The sequences of the central region, C1-C13, were derived from bacteriophage
clones. The ~5 kb sequences that were upstream (U1 to U28) and downstream (D1 to D17)
with 1-kb terminal overlaps were derived from BAC clones. The DNA sequences indicated
by red horizontal bars exhibited enhancer activity. CSBs 19, 26 and 27 that showed the
same activity as the original DNA sequences are marked by red vertical bars. The
nucleotide positions of these sequences and the statistics of the enhancer analysis are
presented in Table S2.

Figure 3. Enhancer activities in placode derivatives of the Sox3-flanking sequences or
CSBs in electroporated chicken embryos at st. 11-12, as indicated by the EGFP expression,
in comparison with Sox3 in situ hybridization (left). Top: Bright-field images with
indication of developmental stages. V, ventral view; D, dorsal view. Bottom: EGFP
fluorescence images of the same field. The bars indicate 200 µm. Di, diencephalon; Ep,
epibranchial placode; LP, lens placode; Mes, mesencephalon; OP, otic placode; Rho,
rhombencephalon; Tel, telencephalon.

These enhancers accounted for only subdomains of Sox3 expression in embryos. Therefore, we
extended the region of the enhancer survey to a 231-kb genomic span. We identified two BAC clones
that covered the upstream and downstream regions of the central 50-kb region. We prepared a series
of ~5 kb DNA sequences that spanned the upstream and downstream regions with 1 kb terminal
overlaps, resulting in 28 upstream sequences (U1-U28) and 17 downstream sequences (D1-D17) that
were external to the central 50-kb sequence (Figure 2), and examined their enhancer activities. The
number of electroporated specimens for each DNA sequence is indicated in Table S2, where the same
specificity of an enhancer was reproducibly demonstrated.
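
The tiling scheme described above (fragments of ~5 kb with 1 kb terminal overlaps) can be sketched
as follows. This is an illustration only: the function name tile_region is hypothetical, fragments
are treated as exactly 5,000 bp, and the span lengths in the usage lines (113 kb and 69 kb) are
assumed values chosen because they tile into 28 and 17 fragments. The actual fragment coordinates
are those listed in Table S2.

```python
def tile_region(start, end, size=5000, overlap=1000):
    """Split the interval [start, end) into fragments of `size` bp whose
    neighbours share `overlap` bp. The last fragment is truncated at `end`.
    """
    if start >= end:
        raise ValueError("start must be smaller than end")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")

    step = size - overlap  # distance between consecutive fragment starts
    tiles = []
    pos = start
    while True:
        tile_end = min(pos + size, end)
        tiles.append((pos, tile_end))
        if tile_end >= end:
            break
        pos += step
    return tiles


# Assumed, illustrative span lengths (not taken from Table S2)
upstream = tile_region(0, 113_000)
downstream = tile_region(0, 69_000)
print(len(upstream), len(downstream))
```

With a 4 kb step between fragment starts, n fragments cover 4n + 1 kb, which is why spans of
roughly 113 kb and 69 kb correspond to 28 and 17 fragments, respectively.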

Figure 4. Sequential activation of Sox3 enhancers that show distinct tissue specificities in
the neural primordia. The D1 activity was represented by the NP1 enhancer sequence (see
also Figure 5C). The top row: Sox3 in situ hybridization patterns in chicken embryos at
respective developmental stages. Other rows: EGFP expression reflecting the specificity of
the enhancers in comparison with the bright-field image of the same embryo. The data of
an enhancer at different developmental stages were taken from the same embryo. V, ventral
view; D, dorsal view. The white triangles indicate the positions of Hensen's node. All
photographs are shown on the same scale. The bar indicates 500 µm. Di, diencephalon;
Mes, mesencephalon; Rho, rhombencephalon; SC, spinal cord; Tel, telencephalon.

In the upstream region, we identified only one sequence, U3, which showed enhancer activity in the
epibranchial placode region at st. 11. However, in the downstream 97 kb region, ten sequences showed
enhancer activities of various tissue specificities (Figures 3 and 4, Table S2). The U3, D1, D3, D5 and
D8 sequences showed enhancer activities in placode derivatives (Figure 3). The D3 sequence showed
an otic placode-specific enhancer activity, while the D8 sequence displayed an epibranchial
placode-specific activity that was similar to the aforementioned U3 enhancer. The other placodal
enhancers also showed enhancer activity in the CNS. The D1, D4, D5, D6, D11, D12, D13 and
D15 sequences showed enhancer activities in various portions of the developing CNS (Figure 4). The
enhancer activity of the D1 sequence was of particular interest because of its strong activity in the
caudal lateral epiblast, which is similar to the Sox2 N1 enhancer [27,29]. The D1 enhancer had
additional activity in the posterior neural plate that forms the rhombencephalon and spinal cord. The
D1 sequence included CSB27, and the isolated CSB27 sequence alone displayed the same activity (see
Figure 5C, below). This CSB enhancer was renamed NP1 and analyzed further. The D5
enhancer showed activity in the CNS anterior to the rhombencephalon and in the lens placode,
indicating a resemblance to the N3 enhancer of Sox2.

2.3. The Time Course of the Activation of Neural Enhancers 

The tissue territories of activation of the Sox3 neural enhancers are compared in Figure 4 following
the time course. Shortly after the electroporation of the tkEGFP reporter vector, the NP1 (CSB27)
enhancer was activated at st. 5 in the region abutting the anterior primitive streak, which is similar to
the Sox2 N1 enhancer [27,29]. Synchronous with primitive streak regression and the posterior
migration of Hensen's node, the peak position of the NP1 enhancer activity moved posteriorly. A
moderate level of enhancer activity remained in the region where the enhancer was once activated,
namely, in the rhombencephalon and spinal cord. Following the activation of NP1, the D13 and D15
enhancers were activated at st. 8, in the prospective diencephalon, rhombencephalon and anterior
spinal cord (D13/15), and in the prospective telencephalon (D15). Later at st. 9, enhancers D4, D5 and
D11 were activated in the prospective telencephalon (D4), the prospective di/mesencephalon (D5) and
the spinal cord (D11). At st. 11, the CSB26 enhancer was activated in the prospective diencephalon
and the medial axial levels of the spinal cord, and the D12 enhancer was activated in the ventral
diencephalon. Next, at st. 15, the D6 enhancer was activated in the mes/rhombencephalon and spinal
cord. These enhancers produced various overlapping patterns of their activities in the CNS primordium.

2.4. Functional Dissection of the NP1 Enhancer Region

In order to examine the NP1 enhancer regulation that was analogous to the Sox2 N1 enhancer
activation in axial stem cells, the DNA sequence of the NP1 enhancer was investigated in detail.
A comparison of the CSB27 sequence (369 bp in the chicken) in the four vertebrate species indicated
the existence of two conspicuous blocks of high sequence conservation, which were positions 52-177
(Block A) and 185-269 (Block B) (Figure 5A).

Starting from a 1681-bp sequence, which included the 369-bp CSB27 sequence, the specific region
that accounted for the enhancer sequence was narrowed down in the following steps (Figure 5B). The
CSB27 sequence showed activity that was identical to the 1681-bp sequence, whereas the immediate
upstream 941-bp sequence showed no activity (Figure 5B(a)). The deletion analysis shown in Figure 5B(b)
indicated that the 52 to 269-bp region of the CSB27 sequence was sufficient for full enhancer activity,
whereas the sequences bearing only one of the two high-conservation blocks (1-180 bp or 181-369 bp) did
not show enhancer activity (Figure 5C). The 5' step-wise deletions indicated that the 1 to 81-bp region
was dispensable, but further deletion to the 92-bp position inactivated the enhancer. The 10-bp 3'
deletion from the 82 to 269-bp sequence inactivated the enhancer. Taken together, these results
indicated that the 82 to 269-bp sequence was required for NP1 enhancer activity.
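
The logic of this deletion analysis can be restated compactly as a sketch. The construct list below
transcribes the CSB27 sub-fragments and activities reported in the text (1-based, inclusive
coordinates); the helper names are hypothetical, and the consistency check rests on the simplifying
assumption that a fragment containing a complete active region should itself be active, that is,
that no inhibitory sequence is present.

```python
# (start, end, active) for the CSB27 sub-fragments described in the text
CONSTRUCTS = [
    (1, 369, True),     # full CSB27, NP1 [369 bp]
    (1, 180, False),    # 5' half, Block A only
    (181, 369, False),  # 3' half, Block B only
    (52, 269, True),    # Blocks A and B, NP1 [218 bp]
    (82, 269, True),    # 30-bp 5' deletion, NP1 [188 bp]
    (92, 269, False),   # 40-bp 5' deletion
    (82, 259, False),   # 10-bp 3' deletion from 82-269
]


def fragment_length(fragment):
    """Length in bp of a (start, end) fragment with inclusive coordinates."""
    start, end = fragment
    return end - start + 1


def minimal_active(constructs):
    """Shortest tested fragment that still shows enhancer activity."""
    active = [(s, e) for s, e, is_active in constructs if is_active]
    return min(active, key=fragment_length)


def is_consistent(constructs):
    """True if no inactive fragment fully contains an active one
    (assumption: no inhibitory sequences within the tested region)."""
    active = [(s, e) for s, e, is_active in constructs if is_active]
    inactive = [(s, e) for s, e, is_active in constructs if not is_active]
    return not any(
        i_start <= a_start and a_end <= i_end
        for i_start, i_end in inactive
        for a_start, a_end in active
    )


core = minimal_active(CONSTRUCTS)
print(core, fragment_length(core), is_consistent(CONSTRUCTS))
```

The shortest active fragment among those tested is 82-269, which is 188 bp long and corresponds to
NP1 [188 bp]. Because only external deletions were tested, this sketch says nothing about whether
internal portions of that fragment are dispensable.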

Figure 5. The sequence requirements for NP1 enhancer activity. A. Alignment of the
CSB27 sequence of human, mouse, chicken and Xenopus. Two strongly conserved blocks,
A and B, are highlighted in blue and pink, respectively. Other conserved bases are shaded
in gray. The nucleotide positions in the chicken sequence are given at the top. B. Deletion
analysis of the NP1 enhancer. The thick dark red lines indicate the DNA sequences
showing full NP1 enhancer activity, while the thin gray lines indicate the sequences
without enhancer activity. (a) A 1,681-bp DNA sequence (NP1 [1681 bp]) that included the
CSB27 sequence showed enhancer activity that was identical to the CSB27 (NP1 [369 bp])
sequence. (b) The 5' and 3' halves of the CSB27 sequence, 1-180 bp and 181-369 bp,
which included only one of the strongly conserved blocks, A and B, respectively, did not
show NP1 enhancer activity. NP1 [218 bp], consisting of the strongly conserved blocks A
and B, displayed full NP1 activity. While a 30-bp deletion from the 5' end (NP1 [188 bp])
did not affect the enhancer activity, a 40-bp deletion (92-269 bp) inactivated the enhancer.
The sequence of 82-259 bp with a 3' 10-bp deletion from NP1 [188 bp] lost enhancer
activity. Thus, NP1 [188 bp] was determined as the minimal DNA sequence that elicited
NP1 enhancer activity that was definable from external deletions. C. Representative
examples used to assess NP1 enhancer activity using chicken embryo electroporation. Both
bright field and EGFP fluorescence images are shown for each sequence tested. D1 [5.0 kb],
NP1 [369 bp] and NP1 [218 bp] sequences showed strong activity (++), while the 1-180 bp
(including Block A) and 181-369 bp (including Block B) sequences were inactive (-). The
bar indicates 500 µm.

2.5. Regulation of the NP1 Enhancer by Wnt and Fgf Signals

The tissue domain that first activated the Sox3 NP1 enhancer, namely the caudal lateral epiblast
abutting the primitive streak, overlapped with that of the Sox2 N1 enhancer as indicated in Figure 6A(a).
The N1 enhancer is activated by the combined action of Wnt and Fgf signals [29,33], which is
consistent with the requirement of both signals in the maintenance of the axial stem cells that activate
the N1 enhancer [32,34]. Thus, whether the NP1 enhancer is also regulated by Wnt and Fgf signals
was investigated. The 52-269 sequence of CSB27 contained several potential Lef1 binding sequences
and putative Fgf-responsive elements (Figure 6A(b)).
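
A search for candidate binding motifs of this kind can be sketched as a scan of both DNA strands.
The example below is an illustration under stated assumptions and not the procedure used in this
study: the pattern CTTTG[AT][AT] is a commonly cited Tcf/Lef core consensus adopted here as an
assumption, the function names are hypothetical, and EXCERPT is a short stretch copied from the
NP1 [218 bp] sequence given with Figure 6A(b).

```python
import re

# Short excerpt of the NP1 [218 bp] sequence as printed with Figure 6A(b)
EXCERPT = "GAAGCGCTCTTTGATGGACTTTCAGG"

# Assumed Tcf/Lef core consensus (W = A or T); not taken from this study
LEF1_CORE = "CTTTG[AT][AT]"


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_motif(seq, pattern):
    """Return (position, strand) for every match of `pattern` on either strand.

    Positions are 0-based and refer to the leftmost base of the match on the
    forward strand. A lookahead is used so that overlapping matches are found.
    """
    seq = seq.upper()
    n = len(seq)
    regex = re.compile(f"(?=({pattern}))")
    hits = [(m.start(), "+") for m in regex.finditer(seq)]
    for m in regex.finditer(reverse_complement(seq)):
        motif_len = len(m.group(1))
        # convert a reverse-strand coordinate back to forward-strand numbering
        hits.append((n - m.start() - motif_len, "-"))
    return sorted(hits)


print(scan_motif(EXCERPT, LEF1_CORE))
```

A consensus match only nominates a candidate site. Whether such a site is functional has to be
tested experimentally, for example by the signaling perturbations described in this section.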

When the expression vector for Dkk1, which is an antagonist of canonical Wnt signaling, was
co-electroporated with NP1-tkEGFP, the NP1 enhancer was inactivated (Figure 6B(b)). In contrast,
when the expression vector for stabilized β-catenin, which constitutively activates the canonical Wnt
signal pathway, was co-electroporated, the NP1 enhancer was activated in the entire embryo (Figure
6B(c)). These results demonstrated the canonical Wnt signal-dependent activation of the NP1
enhancer. Analogously, co-electroporation of a vector to express a soluble form of the Fgf receptor 1
[FGFR1c(ECD)-Fc(IgG2a)] that titrates Fgf molecules, or addition to the culture medium of SU5402,
a specific inhibitor of Fgf receptor tyrosine kinase, inactivated the enhancer (Figure 6B(d)-(f)),
indicating that NP1 enhancer activation also depended on Fgf signaling. These findings indicated that
the Sox3 NP1 enhancer was regulated by Wnt and Fgf signals in a manner similar to that of the Sox2
N1 enhancer, which suggests that both Sox2 and Sox3 genes are regulated in an analogous manner in
neural/mesodermal-bipotential axial stem cells. This would allow the regulation of these B1 Sox genes
to be coordinated with the maintenance and differentiation of axial stem cells, which are also regulated
by Wnt and Fgf signaling.

Figure 6. Regulation of the Sox3 NP1 enhancer. A. Comparison of the Sox3 NP1 enhancer
and Sox2 N1 enhancer. (a) Activities of the NP1 enhancer (NP1-tkEGFP) and N1 enhancer
(N1-tkmRFP1) in the same electroporated chicken embryo at st. 9. The NP1 activity was
recorded using a shorter exposure compared to other panels, in order to show the
distribution of strong enhancer activity posterior to Hensen's node. The node position is
indicated by an arrowhead. (b) Comparison of the DNA sequences of Sox3 NP1 [218 bp]
and the 298-bp Sox2 N1 sequence [27,29]. The Lef-1 binding motifs are highlighted in
pink. The Fgf-responsive element TGTGAC of N1 [29], as well as the related sequences in
N1 and NP1 which are candidate elements for an Fgf response, are highlighted in blue.
B. The dependence of NP1 enhancer activity on Wnt and Fgf signaling. Embryos were
electroporated with the NP1 [218 bp]-tkEGFP, pCMV/SV2-mRFP1 (electroporation
control), and expression vectors for effectors modulating Wnt/Fgf signaling or treated with
SU5402. After 8 hours when the embryos reached ~st. 8, the effects of the effectors on
NP1 enhancer activities were assessed. The number of cases showing the same response as
shown in the representative panel among the treated embryos is indicated in the NP1-EGFP
panels. (a) A control embryo with no effectors. (b) Dkk1 expression that blocked canonical
Wnt signaling inactivated the NP1 enhancer. (c) Expression of stabilized β-catenin that
constitutively activated canonical Wnt signaling resulted in the strong activation of the
NP1 enhancer in the entire embryonic tissue. (d) Expression of a soluble form of Fgf
receptor that titrates out Fgf ligands inactivated the NP1 enhancer. (e) and (f) Addition of
SU5402 to the culture also inactivated the NP1 enhancer. SU5402 was added at 25 µM (e)
or 125 µM (f) in 50 µl yolk solution that overlaid an embryo [36], but not in the supporting
agar medium. The bars indicate 500 µm.



Figure 6A(b) sequences.

Sox3 NP1 [218 bp]

52-151
CGaCCAGACTGGATCAATACAGAGAAGCGCTCTTTGATGGACTTTCAGGCCCTGTGATCTCCCAGCCCTGGGGGTTTAAAGGATGAGCATATCAATCCA
152-251
GGACCACTGAACTGCATTTCCATTGCATGAAGGCTCCTCCTCCGGGCTTTGTCCGGACAAACTCGGCTCCTCCTACTTTTGTCACCGTTCTTTCATGCTT
252-269
TGTTTAGCAAACACAAAA

Sox2 N1

1-100
AAGATGTATAAATATCAAAGTGAAGGAGCCCGAGTAAGTCTTTCTAGAAGCGAGGAGGAAGCTTAAGCAGCTTTCTTTAATGGTGATTTGTAGCTCTGTA
101-200
AGGGGCACGGATAGCAAATACCCTGCAGGTGCATGTTGTAACACTGCCATCCGGACTTTAATGTAGATTACTCTCCAA[...]
201-298
ACCCTTTACAATTGCCTGTGACGAACCGCCAGTCTCAGTTTTTTTTCTTTGAATAATGCCTTAAGGGTAAGTCCTCGGGGCTTTTAAAGATCTCCAGC

3. Discussion



3.1. Regulation of the Sox3 Gene

The basal promoter of the human Sox3 gene has been characterized as having binding sites for
several tissue-nonspecific transcription factors [37-39], suggesting a limited contribution of the
promoter in the tissue-specific regulation of the Sox3 gene. A survey of the mouse genomic sequences
within 3 kb of the Sox3 gene with transgenic mouse embryos identified three regulatory regions that
dictate transgene expression: the FB sequence, approximately 200 bp long and located at
approximately -1.2 kb in the coordinates given in Figure 1, for expression in the
forebrain/midbrain/hindbrain; the V2 sequence, located between -1.1 and 0.3 kb, for expression in the V2
interneuron domain of the spinal cord; and the PNT-R sequence, located between 2.8 and 4.1 kb
relative to the Sox3 translational start site, for expression in the posterior neural tube, rhombencephalon
and otic vesicle [35]. Both the FB and V2 sequences were included in the C7 sequence, and PNT-R
was included in the C8 sequence (Figure 1). Generation of the partial expression pattern of Sox3 using
a X. laevis Sox3-proximal sequence has also been reported [12].

In our present study, a region of a total of 231 kb in the chicken genome encompassing the Sox3
locus was systematically surveyed for enhancers with the region-scanning ~5-kb DNA sequences with
terminal overlaps, and 13 new enhancers that direct gene expression in specific regions of the neural
and/or sensory primordia were identified. A combination of these with the mouse-determined
enhancer-like elements accounts for various features of Sox3 regulation in early neuro-sensory
development. However, it should be noted that our survey using chicken embryo electroporation did
not detect the aforementioned Sox3-proximal elements that were determined using transgenic mouse
embryos. One possible cause of this discrepancy was differences in the assay method. However, it
more probably reflected masking of the enhancer activities by a negatively acting element that was
located around -5 kb of the Sox3 gene. In transgenic mouse experiments [35], it has been reported that
inclusion of this region inhibited forebrain activity of the FB element. Presence of an analogous
enhancer-inhibitory element has also been reported for the X. laevis Sox3 upstream sequence [12]. The
chicken DNA sequences (C7 and C8) covering the Sox3-proximal region were assessed as negative
and included this -5 kb element, which could have masked the enhancer activities (Figure 2).

The Sox3 expression pattern in the chicken embryos and the activities of the newly identified
enhancers are compared in Figures 3 and 4. Sox3 expression at st. 5 covered both the anterior and
posterior neural plates, whereas the NP1 enhancer was active in the posterior neural plate and
especially in the posterior axial stem cell region. It is likely that the FB element that was determined
with the mouse system [35] regulates Sox3 expression in the anterior neural plate. After st. 8, various
enhancers that have activities in the subdomains of the forebrain-midbrain region, i.e., D13, D15, D4,
D5, D11, CSB26, D12 and D6, were activated in sequence. This resulted in various overlapping
patterns of enhancer activities. In addition, at the rhombencephalon level, plural sequences showed
enhancer activity, namely NP1, D13, D15, D6 (Figure 4) and PNT-R [35]. In the spinal cord, many
enhancers showed non-uniform activity along the antero-posterior axis, with the exception of the D15
enhancer. The D13 and D6 enhancers showed strong activity in the anterior half of the spinal cord at
st. 11, whereas the D11 and CSB26 enhancers showed activity in the medial portion of the spinal cord
(Figure 4). In the posterior region of the spinal cord, the NP1 and PNT-R enhancers exhibited
activities. These observations suggested modular regulation of Sox3 expression along the
antero-posterior axis of the spinal cord that depended on the differential activities of the enhancers.

In sensory tissue development, the D5 enhancer was activated in the lens placode area at st. 11, and
the CSB19 enhancer was activated at st. 12 and continued to be active until the later stages of lens
development (Figure 3). Three enhancers exhibiting activity in the otic placode/vesicle have been
identified: the otic placode-specific D3 enhancer, NP1 and PNT-R [35]. NP1 and PNT-R showed
activity in the posterior neural plate, and the activation of these enhancers in both the otic
placode/vesicle and the posterior spinal cord suggested a common signaling input in the development
of the two different tissues. The U3 and D8 enhancers showed activities in the epibranchial
placode regions.

3.2. Comparison of the Regulation of Sox3 and Sox2

We have previously shown that during the early stages of neural plate development, Sox2 regulation
is divided into non-overlapping anterior and posterior territories that are regulated by the N2 and N1
enhancers, respectively [5,27,32]. This reflects the fact that the anterior and posterior neural plates
develop from the epiblast through distinct cellular mechanisms [18,32,33]. Participation of the two
enhancers FB and NP1 in Sox3 regulation in the anterior and posterior neural plate development,
respectively, indicated that Sox2 and Sox3 are regulated on the basis of the same antero-posterior
spatial division of the neural plate.

In a detailed analysis, an important similarity was found in the regulation of the Sox3 NP1 and
Sox2 N1 enhancers. Both of these enhancers were activated in the epiblastic tissue abutting the anterior
primitive streak in which the axial stem cells reside. Both were activated by the combined action
of Wnt and Fgf signals, as inhibition of either of these signals attenuated their enhancer activities
(Figure 6B) [29,33]. Multiple Lef1-binding sites are found in the NP1 sequence (Figure 6A(b)), and a
pair of functional Lef1-binding sites have been determined in the N1 sequence [29,33]. Putative
Fgf-responsive elements were also indicated in both sequences (Figure 6A(b)). These shared features
of NP1 and N1 allow for the coordinated regulation of Sox3 and Sox2 when the axial stem cells give
rise to the neural plate and paraxial mesoderm. However, their regulation was not identical. The
arrangements of the Lef1 sites and putative Fgf-responsive elements were very different between NP1
and N1. In contrast to the case of the N1 enhancer, which was strongly downregulated when the neural
plate was formed, the NP1 enhancer maintained its activity in the spinal cord and was also activated in
the otic vesicle.

Another analogous case was the similarity in specificity between the Sox3 D5 enhancer and the Sox2 N3 
enhancer. These enhancers were activated in the forebrain, including the optic vesicle, and in the lens 
placode in st. 11 chicken embryos. It has been shown that the Sox2 N3 enhancer is activated by the 
cooperative action of Sox2 and Pax6 in the diencephalon, optic vesicle and lens placode [3]. 
Considering the similarities in the tissue specificity of the enhancer activity, it is possible that an 
analogous regulation is involved in the activation of the D5 enhancer. However, their regulation was 
not identical: while the D5 enhancer was active in the telencephalon at st. 11, the N3 enhancer lacked 
activity therein. Thus, the D5 enhancer represented an additional example of the analogy 
between the Sox3 and Sox2 regulatory mechanisms, yet with an appreciable difference. 

Regulation of the Sox3 gene in the neural plate progressively became more complex (Figure 4). 
Following the creation of the antero-posterior subdivisions by the FB and NP1 enhancers in the 
enhancer territories of the forming neural plate, more Sox3 enhancers with various regional specificities 
were activated in sequence. The same scenario holds true for Sox2-regulating enhancers [27,28]. However, 
the activities of individual enhancers differed between Sox3 and Sox2 with respect to their 
aforementioned specificities. 

The genomic arrangements of the enhancers also differed between the Sox3 and Sox2 genes. For 
instance, the D5 enhancer that was activated in the ocular primordia was located ~50 kb downstream 
of the Sox3 gene, while the N3 enhancer was positioned ~15 kb upstream of the Sox2 gene. Whereas 
the major Sox3 enhancers were distributed in the region between -20 kb and 80 kb relative to the Sox3 
gene [35] (Figure 2), the Sox2 enhancers were more widely distributed on both sides of the gene over 
a range of 200 kb [28]. Thus, with respect to their individual activities and genomic organization, the 
enhancers differed substantially between the Sox3 and Sox2 genes. 

3.3. Phylogenetic Conservation of Sox3 Regulation 

The wide expression of Sox3 in the neuro-sensory primordia is common to most vertebrate species, 
whereas the wide expression of Sox2 is limited to amniote animals [4]. Reflecting this fact, the 
individual CSBs (potential enhancers) of the Sox3 locus were strongly conserved among the genomes 
of Xenopus, chicken and mammals (Figure 1), in contrast to the Sox2-associated CSBs, whose high 
degree of conservation is confined to amniotes [4]. 

Along with these general features, species-dependent episodic changes in the enhancers were also 
observed. The sequence of the lens-specific CSB19 enhancer of Sox3 is poorly conserved in the mouse 
genome, and this may account for the absence of Sox3 expression during mouse lens development [6]. 
However, the Sox3 D3 sequence, which showed purely otic placode-specific enhancer activity (Figure 3), 
did not contain CSBs (Figure 2), suggesting that this enhancer was unique to avian genomes. 

Conservation of genomic sequences downstream of Sox3, as indicated by the occurrence of CSBs 
common to chicken, Xenopus and mammals, ends beyond CSB47, which is located at approximately 
90 kb in the chicken genome, whereas the conservation among mammalian sequences extends 
further 3'. Both upstream and downstream genomic regions of the Sox3 gene appeared to be expanded 
in the mammalian genomes compared to the chicken and Xenopus genomes. An interesting issue is 
whether this genomic expansion is correlated with the location of Sox3 on the X-chromosome in 
mammalian species. The arrangements of landmark genes in syntenic chromosomal regions indicated 
extensive rearrangements of the genomic domains around the Sox3 gene in various species (Figure S1), 
suggesting that the genomic region investigated in this study included the major components of the 
Sox3 regulatory sequences. Alternatively, the Sox3 gene may be regulated by distant-acting regulatory 
sequences that withstand extensive genomic rearrangements. 

4. Materials and Methods 

4.1. Genome Sequences of the Sox3 Locus 

To analyze the enhancers and conserved sequence blocks (CSBs) in the 231-kb region spanning the 
Sox3 locus in the chicken genome, we used sequence data from the Gallus gallus-4.0 assembly 
(GCA_000002315.2) chromosome 4 genomic scaffold sequence (NW_003763735 GPS_000848988) 
as a reference. The central region sequence, from -23456 to 29859, covered by our bacteriophage 
library [27], was determined by us (DDBJ Accession number AB753847). The translational start site of 
Sox3 was taken as +1 to indicate the genomic location relative to Sox3. This position corresponded to 
Gallus gallus chromosome 4 NW_003763735 sequence position 10474038, Homo sapiens 
chromosome X GRCh37 partial sequence position 139587000 (Sox3 is in reverse orientation), 
Mus musculus chromosome X GRCm38 partial sequence position 60893205 (Sox3 is in reverse 
orientation), and Xenopus tropicalis genome scaffold GL172698.1 position 1000090. The CSBs were 
screened using VISTA Browser (http://genome.lbl.gov/vista/) with a threshold of >60% identity over 
a 100-bp length. 
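
The ">60% identity over a 100-bp length" criterion can be illustrated with a minimal sliding-window sketch. This is not the VISTA implementation; it assumes two sequences that have already been aligned to equal length (with "-" marking gaps), and the function name `conserved_windows` and its parameters are hypothetical names introduced only for this illustration.

```python
def conserved_windows(seq_a, seq_b, window=100, min_identity=0.60):
    """Return merged half-open (start, end) intervals, in alignment columns,
    covered by windows whose identity exceeds min_identity.

    Assumption: seq_a and seq_b are pre-aligned strings of equal length,
    with '-' used for alignment gaps (a gap never counts as a match).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    n = len(seq_a)
    if n < window:
        return []

    a, b = seq_a.upper(), seq_b.upper()
    # 1 where both aligned positions carry the same base, else 0
    match = [int(x == y and x != "-") for x, y in zip(a, b)]

    blocks = []
    count = sum(match[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # slide the window by one column
            count += match[start + window - 1] - match[start - 1]
        if count / window > min_identity:
            end = start + window
            if blocks and start <= blocks[-1][1]:
                # merge with the previous overlapping or adjacent window
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks
```

Applied to a pairwise alignment of, for example, chicken and Xenopus sequence, the returned intervals would correspond to candidate conserved blocks under this simplified criterion; the actual CSB boundaries reported here come from VISTA Browser.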

4.2. Enhancer Screening of the 231-kb Genomic Region and Characterization 

The bacteriophage clones that carry the Sox3-proximal region sequences, approximately 50 kb, were 
used to produce the Sox3-proximal sequence segments C1-C13, and the two BAC clones CH261-112C13 
and CH261-98D21 (ENSEMBL) were used to produce a series of ~5 kb DNA sequences with 1-kb 
terminal overlaps, U1 to U28 (upstream) and D1 to D17 (downstream), by PCR. These isolated 
sequences were individually inserted into a ptkEGFP vector [27], and electroporated at 2 µg/µL into 
the dorsal side of st. 4 chicken embryos. The pCMV-mRFP1 vector was also included in order to 
monitor the region of successful electroporation. The activation of EGFP expression at various 
developmental stages in New's culture was scored as enhancer activity. 
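
The tiling scheme described above (fragments of ~5 kb with 1-kb terminal overlaps) can be sketched as follows. This is only an illustration of the layout under idealized assumptions: `tile_region` is a hypothetical helper, coordinates are treated as plain inclusive integers, and every tile except possibly the last has exactly the nominal length.

```python
def tile_region(start, end, tile_len=5000, overlap=1000):
    """Split the inclusive interval [start, end] into tiles of tile_len bp,
    each sharing `overlap` bp with the next one.

    Returns a list of inclusive (tile_start, tile_end) pairs; the final
    tile is truncated at `end` and may be shorter than tile_len.
    """
    if tile_len <= overlap:
        raise ValueError("tile_len must be larger than overlap")
    if end < start:
        raise ValueError("end must not be smaller than start")

    step = tile_len - overlap  # distance between consecutive tile starts
    tiles = []
    pos = start
    while True:
        tile_end = min(pos + tile_len - 1, end)
        tiles.append((pos, tile_end))
        if tile_end == end:
            break
        pos += step
    return tiles


if __name__ == "__main__":
    for tile in tile_region(1, 13000):
        print(tile)
```

The fragments actually tested (Table S2) vary in length from roughly 3.9 to 5.2 kb, so the fixed-step layout above should be read as an approximation of the design rather than a reconstruction of the PCR products.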

4.3. Effector-Dependent Regulation of the NP1 Enhancer 

pNP1[218 bp]-tkEGFP at 2 µg/µL, pCMV/SV2-mRFP1 at 0.6 µg/µL and pCAGGS-based 
expression vectors [40] at 2 µg/µL for the expression of Dkk1, stabilized β-catenin [29] or 
rFGFR1(ECD)-rFc(IgG2a) (gift of Claudio Stern) were electroporated in st. 4 embryos in New's 
culture. The impact of the expression of these effectors was assessed 8 hours after electroporation. 

Supplementary Materials 

Figure S1. The arrangement of genes surrounding the Sox3 locus in human, mouse, 
chicken and Xenopus tropicalis. The landmark genes are highlighted in different colors, 
indicating the occurrence of extensive rearrangements of the chromosomal segments 
among the animal species examined. 

Table S1. Positioning of the conserved sequence blocks (CSBs) in the human, mouse, chicken and Xenopus genomes. 

CSB | Chicken positions (a) | Chicken length (bp) | Human positions (a) | Human length (bp) | Human vs. chicken identity (%) | Mouse positions (a) | Mouse length (bp) | Mouse vs. chicken identity (%) | Xenopus positions (a) | Xenopus length (bp) | Xenopus vs. chicken identity (%)
1-15 | (entries not legible)
16 | -17784 ~ -17128 | 657 | -28051 ~ -27389 | 663 | 82 | -25441 ~ -24780 | 662 | 81 | -18039 ~ -17396 | 644 | 61
17 | -14950 ~ -14611 | 340 | -23177 ~ -22843 | 335 | 76 | -22166 ~ -21840 | 327 | 69 | -15729 ~ -15389 | 341 | 60
18 | -13560 ~ -13273 | 288 | -22619 ~ -22337 | 283 | 65 | -21624 ~ -21340 | 285 | 63 | Absent | - | -
19 | -11574 ~ -11339 | 236 | -21444 ~ -21219 | 226 | 63 | -20545 ~ -20316 | 230 | 51 (d) | -13988 ~ -13757 | 232 | 71
20 | -5363 ~ -4690 | 674 | -7397 ~ -6739 | 659 | 77 | -6672 ~ -6014 | 659 | 76 | -7241 ~ -6571 | 671 | 60
21 | -4176 ~ -4004 | 173 | Absent | - | - | Absent | - | - | -5417 ~ -5246 | 172 | 64
22 | -944 ~ -792 | 153 | -1258 ~ -1099 | 160 | 65 | -1253 ~ -1097 | 157 | 65 | Absent | - | -
23-34 and Sox3 ORF (b) | (entries not legible)
35 | 60282 ~ 60540 | 259 | 314971 ~ 315223 | 253 | 67 | 212815 ~ 213069 | 255 | 69 | 82061 ~ 82322 | 262 | 56 (d)
36 | 69025 ~ 69166 | 142 | 325798 ~ 325935 | 138 | 61 | 235698 ~ 235833 | 136 | 62 | Absent | - | -
37 | 69205 ~ 69439 | 235 | 325967 ~ 326202 | 236 | 61 | 235872 ~ 236110 | 239 | 60 | Absent | - | -
38 | 71699 ~ 72240 | 542 | 346769 ~ 347310 | 542 | 82 | 262559 ~ 263101 | 543 | 73 | 93894 ~ 94430 | 537 | 65
39 | 75210 ~ 75377 | 168 | 376498 ~ 376664 | 167 | 84 (c) | Absent | - | - | 95795 ~ 95968 | 174 | 71
40 | 77117 ~ 77242 | 126 | 393019 ~ 393146 | 128 | 69 | 322436 ~ 322565 | 130 | 60 | Absent | - | -
41 | 77293 ~ 77414 | 122 | 393172 ~ 393292 | 121 | 62 | 322599 ~ 322709 | 111 | 50 (d) | Absent | - | -

(a) The open reading frame (ORF) start site of Sox3 is taken as 1. 
(b) Other translational start sites are annotated and conserved only in some mammalian genomes. 
(c) The case where the sequence was inverted relative to the chicken sequence. 
(d) The cases where the sequence identity vs. chicken was below 60%. 

Table S2. Positioning of the DNA sequences relative to the Sox3 ORF start site, which 
were tested for enhancer activity. 

Sequence number | Positions relative to the Sox3 start site (bp) | Sequence length (bp) | Enhancer activity in the CNS (HH stages) (a) | Non-CNS enhancer activity (HH stages) (a) | Analyzed specimens
U28 | -134079 ~ -129036 | 5044 | - | - | 3
U27 | -130087 ~ -125081 | 5007 | - | - | 7
U26 | -126128 ~ -121082 | 5047 | - | - | 8
U25 | -122088 ~ -117083 | 5006 | - | - | 3
U24 | -118083 ~ -113121 | 4963 | - | - | 4
U23 | -114084 ~ -109748 | 4337 | - | - | 4
U22 | -110784 ~ -106313 | 4472 | - | - | 5
U21 | -107300 ~ -102342 | 4959 | - | - | 9
U20 | -103342 ~ -98324 | 5019 | - | - | 5
U19 | -99343 ~ -94344 | 5000 | - | - | 6
U18 | -95344 ~ -90289 | 5056 | - | - | 4
U17 | -91345 ~ -86346 | 5000 | - | - | 4
U16 | -87346 ~ -82309 | 5038 | - | - | 6
U15 | -83347 ~ -78332 | 5016 | - | - | 8
U14 | -79348 ~ -74349 | 5000 | - | - | 2
U13 | -75351 ~ -70345 | 5007 | - | - | 3
U12 | -71350 ~ -66311 | 5040 | - | - | 3
U11 | -67351 ~ -62352 | 5000 | - | - | 4
U10 | -63400 ~ -58258 | 5143 | - | - | 5
U9 | -59365 ~ -54312 | 5054 | - | - | 3
U8 | -55353 ~ -50146 | 5208 | - | - | 6
U7 | -51354 ~ -46353 | 5002 | - | - | 9
U6 | -47369 ~ -42356 | 5014 | - | - | 8
U5 | -43364 ~ -38357 | 5008 | - | - | 3
U4 | -39363 ~ -34454 | 4910 | - | - | 3
U3 | -35550 ~ -30442 | 5109 | - | Epibranchial placode (st. 9) | 13
U2 | -31484 ~ -26456 | 5029 | - | - | 3
U1 | -27456 ~ -22435 | 5022 | - | - | (not legible)
C1 | -23456 ~ -16912 | 6545 | - | - | 4
C2 | -22029 ~ -14489 | 7541 | - | - | 1
C3 | -16915 ~ -12683 | 4233 | - | - | 4
C4 | -16353 ~ -10686 | 5668 | - | Lens placode (st. 11) | 4
C5 | -13647 ~ -9325 | 4323 | - | Lens placode (st. 11) | 6
C6 | -10822 ~ -6096 | 4727 | - | - | 2
C7 | -7397 ~ 2548 | 9945 | - | - | 5
C8 | -2027 ~ 6462 | 8489 | - | - | 5
C9 | 4748 ~ 15637 | 10890 | - | - | 5
C10 | 6459 ~ 11599 | 5141 | - | - | 4
C11 | 13024 ~ 21441 | 8418 | - | - | 3
C12 | 16899 ~ 29642 | 12744 | Diencephalon, spinal cord (st. 11) | - | 8
C13 | 21752 ~ 28244 | 6493 | Diencephalon, spinal cord (st. 11) | - | (not legible)
D1 | 28860 ~ 33859 | 5000 | Posterior neural plate, rhombencephalon, spinal cord (st. 5) | Otic placode (st. 11) | 15
D2 | 32859 ~ 37858 | 5000 | - | - | 3
D3 | 36858 ~ 41867 | 5010 | - | Otic placode (st. 11) | 11
D4 | 40830 ~ 45856 | 5027 | Telencephalon (st. 9) | - | 11
D5 | 44847 ~ 49855 | 5009 | Tel/di/mesencephalon (st. 9) | Lens placode (st. 11) | 12
D6 | 48830 ~ 53858 | 5029 | Mes/rhombencephalon, spinal cord (st. 15) | - | 8
D7 | 52798 ~ 57856 | 5059 | - | - | 3
D8 | 56853 ~ 61861 | 5009 | - | Epibranchial placode (st. 9) | 7
D9 | 60834 ~ 65872 | 5039 | - | Lens (st. 17) (b) | 3
D10 | 64835 ~ 69850 | 5016 | Ventral diencephalon (st. 13) (b) | Otic vesicle, nasal pit (st. 13) (b) | 3
D11 | 68843 ~ 73849 | 5007 | Spinal cord (st. 9) | - | 10
D12 | 72849 ~ 77848 | 5000 | Diencephalon (st. 11) | - | 11
D13 | 76848 ~ 81847 | 5000 | Posterior neural plate, di/rhombencephalon, spinal cord (st. 8) | - | 12
D14 | 80847 ~ 85895 | 5049 | - | - | 3
D15 | 84850 ~ 88760 | 3911 | CNS (st. 8) | - | 3
D16 | 88029 ~ 93228 | 5200 | - | - | 3
D17 | 92231 ~ 97230 | 5000 | - | - | 5
CSB19 | -12377 ~ -10964 | 1414 | - | Lens placode (st. 11) | 3
CSB26 | 24264 ~ 25312 | 1049 | Diencephalon, spinal cord (st. 11) | - | 5
CSB27 | 28179 ~ 29859 | 1681 | Posterior neural plate, rhombencephalon, spinal cord (st. 5) | Otic placode (st. 11) | 37

(a) The earliest developmental stages recorded for the EGFP expression are indicated. 
(b) These showed low enhancer activities, and the data are not shown in this text. 